Abstract 

Subgraph Isomorphism is a very basic graph problem, where given two graphs G and H 
one is to check whether G is a subgraph of H. Despite its simple definition, the Subgraph 
Isomorphism problem turns out to be very broad, as it generalizes problems such as Clique, 
r-Coloring, Hamiltonicity, Set Packing and Bandwidth. However, for all of the mentioned 
problems $2^{O(n)}$ time algorithms exist, so a natural and frequently asked question in the past 
was whether there exists a $2^{O(n)}$ time algorithm for Subgraph Isomorphism. In the monograph 
of Fomin and Kratsch [Springer'10] this question is highlighted as an open problem, among few 
others. 

Our main result is a reduction from 3-SAT, producing a subexponential number of sublinear 
instances of the Subgraph Isomorphism problem. In particular, our reduction implies a 
$2^{\Omega(n\sqrt{\log n})}$ lower bound for Subgraph Isomorphism under the Exponential Time Hypothesis. 
This shows that there exist classes of graphs that are strictly harder to embed than cliques or 
Hamiltonian cycles. 

The core of our reduction consists of two steps. First, we preprocess and pack variables 
and clauses of a 3-SAT formula into groups of logarithmic size. However, the grouping is 
not arbitrary, since as a result we obtain only a limited interaction between the groups. In 
the second step, we overcome the technical hardness of encoding evaluations as permutations 
by a simple, yet fruitful scheme of guessing the sizes of preimages of an arbitrary mapping, 
reducing the case of arbitrary mapping to bijections. In fact, when applying this step to a 
recent independent result of Fomin et al.[CoRR abs/1502.05447 (2015)], who showed hardness 
of Graph Homomorphism, we can transfer their hardness result to Subgraph Isomorphism, 
implying a nearly tight lower bound of $2^{\Omega(n \log n/\log\log n)}$. 


Institute of Informatics, University of Warsaw, cygan@mimuw.edu.pl 
Carnegie Mellon University, pachocki@cs.cmu.edu 
Institute of Informatics, University of Warsaw, a.socala@mimuw.edu.pl 



1 Introduction 

Perhaps the most basic relation between graphs is that of being a subgraph. We say that G is a 
subgraph of H if one can remove some edges and vertices of H, so that what remains is isomorphic 
to G. Formally, the question of one graph being a subgraph of another is the base of the Subgraph 
Isomorphism problem. 

Subgraph Isomorphism 
Input: undirected graphs G, H. 

Question: is G a subgraph of H, i.e., does there exist an injective function $g : V(G) \to V(H)$, 
such that for each edge $uv \in E(G)$ we have $g(u)g(v) \in E(H)$? 

Subgraph Isomorphism is an important and very general question, having the form of a 
pattern matching - we will call G the pattern graph and H the host graph. Observe that several 
flagship graph problems can be viewed as instances of Subgraph Isomorphism: 

• Hamiltonicity(G): is $C_n$ (a cycle with n vertices) a subgraph of G? 

• Clique(G, k): is $K_k$ a subgraph of G? 

• 3-Coloring(G): is G a subgraph of $K_{n,n,n}$, a tripartite graph with n vertices in each of its 
three independent sets? 

• VertexCover(G, k): is G a subgraph of H, H being a full join between a clique of size k 
and an independent set of size $n - k$? 

One can continue showing the richness of Subgraph Isomorphism by simple linear reductions 
from Bandwidth, Set Packing and several other problems. 

All of the mentioned problems are NP-complete, and the best known algorithms for all the 
listed special cases work in exponential time. In fact, all those problems are well-studied from the 
exact exponential algorithms perspective [3, 4, 5, 6, 7], where the goal is to obtain an algorithm of 
running time $O(c^n)$ for the smallest possible value of c. Furthermore, the Subgraph Isomorphism 
problem was very extensively studied from the viewpoint of fixed parameter tractability, see [16] for 
a discussion of 19 different possible parametrizations. All the mentioned special cases of Subgraph 
Isomorphism admit $O(c^n)$ time algorithms, by using either branching, the inclusion-exclusion principle 
or dynamic programming. On the other hand, a simple exhaustive search for the Subgraph 
Isomorphism problem, enumerating all possible mappings from the pattern graph to the host 
graph, runs in $2^{O(n \log n)}$ time, where n is the total number of vertices of the host graph and 
pattern graph. 
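
The exhaustive search mentioned above can be sketched in a few lines. The snippet below is only an illustration under our own assumptions (graphs given as vertex collections plus edge lists; the function name `is_subgraph` is hypothetical and not part of the paper): it tries every injective mapping from the pattern to the host, which is exactly the source of the $2^{O(n \log n)}$ running time.

```python
from itertools import permutations

def is_subgraph(pattern_vertices, pattern_edges, host_vertices, host_edges):
    """Exhaustive search: try every injective map g from pattern to host vertices."""
    host_adj = {frozenset(e) for e in host_edges}
    pattern_vertices = list(pattern_vertices)
    # permutations(host, r) enumerates all injective assignments of r pattern vertices
    for image in permutations(host_vertices, len(pattern_vertices)):
        g = dict(zip(pattern_vertices, image))
        if all(frozenset((g[u], g[v])) in host_adj for u, v in pattern_edges):
            return True
    return False

# Hamiltonicity as Subgraph Isomorphism: is the cycle C_4 a subgraph of the host?
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
square_with_diagonal = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]
star = [("a", "b"), ("a", "c"), ("a", "d")]
has_cycle_in_square = is_subgraph(range(4), cycle4, "abcd", square_with_diagonal)
has_cycle_in_star = is_subgraph(range(4), cycle4, "abcd", star)
```

The sketch assumes simple undirected graphs and makes no attempt at pruning; it is meant only to make the search space explicit.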

Therefore, a natural question is whether Subgraph Isomorphism admits an $O(c^n)$ time 
algorithm. This was repeatedly posed as an open problem [1, 2, 8, 9, 11]. In particular, Fomin 
and Kratsch in their monograph [10] put the existence of an $O(c^n)$ time algorithm for Subgraph 
Isomorphism among the few questions in the open problems section. 

Our results and techniques Our main result is a reduction which transforms a 3-SAT formula 
into a subexponential number of sublinear instances of the Subgraph Isomorphism problem. This 
implies that an $O(c^n)$ time algorithm for Subgraph Isomorphism would imply a subexponential 
algorithm for 3-SAT, thus refuting the Exponential Time Hypothesis of Impagliazzo, Paturi and 
Zane [12, 13]. The Exponential Time Hypothesis is an established assumption; several interesting 
lower bounds have been found under this conjecture (see [14] for a survey). 

Theorem 1.1 There is no algorithm which solves Subgraph Isomorphism in $2^{o(n\sqrt{\log n})}$ time, 
unless the Exponential Time Hypothesis fails. 

Our reduction can be broken into three steps: 

• First, in Section 4, we preprocess the given 3-SAT formula and pack its variables and clauses 
into groups of logarithmic size. Importantly, we ensure that there is only a limited interaction 
between the groups by marking variables with colors - applying further steps of the reduction 
for an arbitrary grouping would not yield a superexponential lower bound for Subgraph 
Isomorphism. 

• Next, in Section 5, we use the packing to create $2^{O(n/\log n)}$ smaller instances of a variant 
of the Subgraph Isomorphism problem, where additionally vertices and edges have colors 
which have to be preserved by the mapping. This proves that the color variant of Subgraph 
Isomorphism admits a tight lower bound of $2^{\Omega(n \log n)}$ under the Exponential Time Hypothesis. 
In this step, we use a simple technique of guessing preimage sizes, which allows us to 
circumvent the usual technical difficulties of encoding valuations by permutations. 

• Finally, in Section 6 we reduce the color version of Subgraph Isomorphism to the original 
variant, incurring an $O(\sqrt{\log n})$ increase in the instance size. 

We would like to note that very recently and independently, Fomin et al. [9], in an unpublished 
work, proved that under the Exponential Time Hypothesis there is no $2^{o(n \log n/\log\log n)}$ time 
algorithm for a related problem called Graph Homomorphism. Graph Homomorphism has a 
similar definition to Subgraph Isomorphism, except that the mapping is not constrained to be 
injective (i.e., in a homomorphism many vertices of the pattern graph may be mapped to the same 
vertex of the host graph). One could think that Graph Homomorphism is a harder problem than 
Subgraph Isomorphism, as for example in [2] Amini et al. have shown that counting subgraphs 
can be reduced to counting homomorphisms. In fact, Fomin et al. [9] in their work about Graph 
Homomorphism mention the question about Subgraph Isomorphism as an open problem. 

Theorem 1.2 [9] There is no algorithm which solves Graph Homomorphism in $2^{o(n \log h/\log\log h)}$ 
time, where $h = O(\mathrm{poly}(n))$ is the size of the host graph and n is the size of the pattern graph, 
unless the Exponential Time Hypothesis fails. 

In Section 7 we prove that by applying our simple scheme of guessing preimage sizes, one can 
transform an instance of Graph Homomorphism into an exponential number of instances of 
Subgraph Isomorphism. 

Theorem 1.3 Given an instance (G, H) of Graph Homomorphism one can in $O(2^n \mathrm{poly}(n))$ 
time create $2^n$ instances of Subgraph Isomorphism with n vertices, where $n = |V(G)| + |V(H)|$, 
such that (G, H) is a yes-instance iff at least one of the created instances of Subgraph 
Isomorphism is a yes-instance. 

Note that Theorem 1.3, when combined with the lower bound of Fomin et al. quoted in 
Theorem 1.2, implies a stronger lower bound for Subgraph Isomorphism. 

Corollary 1.4 There is no algorithm which solves Subgraph Isomorphism in $2^{o(n \log n/\log\log n)}$ 
time, unless the Exponential Time Hypothesis fails. 

2 Preliminaries 

Notation We use the convention $[k] = \{0, \dots, k-1\}$. All the graphs used in this article are 
undirected, however in edge colored graphs there might be several parallel edges between the same 
pair of vertices. We use standard graph notation: for an undirected graph G, by V(G) we denote 
the set of vertices of G, whereas by E(G) we denote the set of edges of G. 

For a CNF-SAT formula $\varphi$ let $\mathrm{Var}(\varphi)$ be the set of variables of $\varphi$, whereas $\mathrm{Clauses}(\varphi)$ is the 
set of clauses of $\varphi$. 

By saying that two instances I, I' of some decision problems P and Q, respectively, are 
equivalent, we mean that I is a yes-instance of the problem P iff I' is a yes-instance of the problem Q. 
In particular, two formulas are equivalent iff either both or none of them are satisfiable. 

To simplify the reduction we use the standard method of transforming a 3-SAT formula into an 
equivalent formula with exactly three different variables in each clause and each variable occurring 
in at most 4 clauses. 

Lemma 2.1 [18] Given a 3-SAT formula $\varphi$ with m clauses one can transform it in polynomial 
time into a formula $\varphi'$ with O(m) variables and O(m) clauses, such that $\varphi$ is satisfiable iff $\varphi'$ is 
satisfiable, and moreover each clause of $\varphi'$ contains exactly three variables and each variable occurs 
in at most 4 clauses of $\varphi'$. 

Exponential Time Hypothesis The Exponential Time Hypothesis, introduced by Impagliazzo, 
Paturi and Zane [12, 13], states that it is impossible to solve 3-SAT in time subexponential in the 
number of variables. Note that the $O^*()$ notation suppresses polynomial factors. 

Conjecture 2.2 (Exponential Time Hypothesis [13]) There exists a constant c > 0, such 
that there is no algorithm solving 3-SAT in time $O^*(2^{cn})$. 

One of the reasons why the Exponential Time Hypothesis became a robust tool for proving 
lower bounds is the Sparsification Lemma, which allows one to reduce the number of clauses in a 
formula to be linear in the number of variables. 

Lemma 2.3 (Sparsification Lemma [12]) For each $\varepsilon > 0$ there exists a constant $c_\varepsilon$, such that 
any 3-SAT formula $\varphi$ with n variables can be expressed as $\varphi = \bigvee_{i=1}^{t} \psi_i$, where $t \le 2^{\varepsilon n}$ and each $\psi_i$ 
is a 3-SAT formula with the same variable set as $\varphi$, but contains at most $c_\varepsilon n$ clauses. Moreover, 
this disjunction can be computed in time $O^*(2^{\varepsilon n})$. 

3 Overview 

We define the size of a Subgraph Isomorphism instance to be the total number of vertices in 
the pattern and host graphs. 

Definition 3.1 We define the (c, t)-Subgraph Isomorphism problem as a generalization of 
Subgraph Isomorphism where every vertex of the pattern and host graphs is colored in one of c colors, 
and every edge is colored in one of t colors, and the mapping is restricted to preserving vertex and 
edge colors. 

In particular, Subgraph Isomorphism is the same as (1, 1)-Subgraph Isomorphism. The 
pipeline of our lower bound consists of two steps. First, in Lemma 3.2, given a 3-SAT formula with 
n variables we construct a set of $2^{O(n/\log n)}$ instances of $(O(1), O(\log n))$-Subgraph Isomorphism 
of $O(n/\log n)$ size each. Note that the number of vertex colors is constant, whereas the number 
of edge colors is logarithmic. In the second step (Lemma 3.3) we reduce to the original variant of 
Subgraph Isomorphism, with an additional increase in the instance size by a factor of $O(\sqrt{\log n})$, 
leading to a final size of $O(n/\sqrt{\log n})$, which is sublinear. 

Lemma 3.2 Given a 3-SAT formula $\varphi$ with n variables, where each variable occurs in at most 4 
clauses and each clause involves exactly three variables, one can in $2^{O(n/\log n)}$ time create a set S 
of $2^{O(n/\log n)}$ instances of $(O(1), O(\log n))$-Subgraph Isomorphism of size $O(n/\log n)$, such that 
$\varphi$ is satisfiable iff some instance in S is a yes-instance, and the host graph and the pattern graph have 
the same number of vertices for every instance in S. 

Lemma 3.3 An instance of (c, t)-Subgraph Isomorphism, where the host graph and the pattern 
graph have the same number of vertices, can be reduced to an equivalent instance of Subgraph 
Isomorphism with $O(c\sqrt{t})$ times more vertices. 

Having the two lemmas above, which we prove in the remainder of this paper, we can prove 
Theorem 1.1. 

Proof of Theorem 1.1: Assume that a $2^{o(n\sqrt{\log n})}$ time algorithm exists for the Subgraph 
Isomorphism problem, where $n = |V(G)| + |V(H)|$. For a given $\varepsilon > 0$, we show an algorithm 
solving a given 3-SAT formula $\varphi$ with n variables in time $O^*(2^{2\varepsilon n})$, leading to a contradiction with 
the Exponential Time Hypothesis. 

First, we sparsify the formula using Lemma 2.3 to obtain at most $2^{\varepsilon n}$ formulas $\psi_i$, each with n 
variables and O(n) clauses (where the hidden constant depends on $\varepsilon$). Consider each $\psi_i$ 
independently. For a fixed $\psi_i$, we use Lemma 2.1 to obtain an equivalent formula with O(n) variables and 
clauses, with the additional property that each clause involves exactly three variables and each 
variable appears in at most 4 clauses. Consequently, the prerequisites of Lemma 3.2 are satisfied, and 
in $2^{O(n/\log n)}$ time we can obtain a corresponding set S of $2^{O(n/\log n)}$ instances of $(O(1), O(\log n))$-Subgraph 
Isomorphism of size $O(n/\log n)$ each. Next, we apply Lemma 3.3 to transform each 
instance in S into an instance of Subgraph Isomorphism of size $O(n/\sqrt{\log n})$, obtaining the set 
S'. Finally, we apply the hypothetical $2^{o(n\sqrt{\log n})}$-time algorithm to the instances in S', leading to 
$2^{O(n/\log n)} \cdot 2^{o(n)} = 2^{o(n)}$ running time. Note that the total running time is 
$O^*(2^{\varepsilon n}) + O^*(2^{\varepsilon n}) \cdot 2^{o(n)}$, 
which is not more than $O^*(2^{2\varepsilon n})$, as promised, hence the theorem follows. ■ 
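
The step in which the hypothetical algorithm is run on an instance of sublinear size can be spelled out as a short calculation. The derivation below is a sketch in our own notation (N for the size of one uncolored instance and C for the constant hidden in the O-notation are assumptions introduced here for illustration); it shows why an instance of size $O(n/\sqrt{\log n})$ is solved in $2^{o(n)}$ time.

```latex
% Assumption: N is the size of one uncolored instance, C a constant hidden in O(.)
\begin{align*}
N &\le \frac{C\,n}{\sqrt{\log n}}
  \quad\Longrightarrow\quad \log N \le \log n \ \text{ for sufficiently large } n,\\
N\sqrt{\log N} &\le \frac{C\,n}{\sqrt{\log n}} \cdot \sqrt{\log n} = C\,n,\\
2^{o(N\sqrt{\log N})} &= 2^{o(n)} .
\end{align*}
```

Multiplying by the $2^{O(n/\log n)}$ instances produced for one sparsified formula keeps the bound at $2^{o(n)}$, since $n/\log n = o(n)$.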

We prove Lemma 3.2 in Section 5 and Lemma 3.3 in Section 6. However, before we describe 
the reduction, in Section 4 we present how to group clauses of a given 3-SAT formula in a way 
that allows a sublinear reduction to Subgraph Isomorphism. 

4 Grouping clauses 

As we already mentioned, when proving superexponential lower bounds based on the Exponential 
Time Hypothesis, we need to come up with a reduction producing an instance of Subgraph 
Isomorphism of sublinear size. In this section we show how to preprocess a given 3-SAT formula 
and partition its clauses into groups of logarithmic size. Our grouping is far from arbitrary, as we 
need to precisely control the interactions between clauses sharing the same variables. 

Before we arrive at our main structural lemma, we need a simple step in which we assign colors 
to variables so that no clause contains two variables of the same color and moreover the counts of 
variables in each color are balanced. The proof of the following Lemma is contained in Appendix A. 

Lemma 4.1 Given an integer k > 9 and a 3-SAT formula $\varphi$ with n variables, where each 
variable occurs in at most 4 clauses, we can color the variables of $\varphi$ in polynomial time using at 
most k colors, so that no more than $\lceil n/(k-9) \rceil$ variables share the same color and no clause 
contains two variables of the same color. 

Having Lemma 4.1 we are ready to pack the clauses of a given 3-SAT formula into $2^k$ groups, 
which is the main structural insight in our reduction. It is important that no two clauses from the 
same group contain variables of the same color. 

Lemma 4.2 Given a 3-SAT formula $\varphi$ with n > 16 variables, such that each clause involves 
exactly three variables and each variable occurs in at most 4 clauses, one can in polynomial time 
construct: 

• a coloring $l : \mathrm{Var}(\varphi) \to [k]$ of the variables in $\varphi$ into k colors, such that no two variables 
contained in a clause of $\varphi$ share the same color, and 

• a packing $h : \mathrm{Clauses}(\varphi) \to [2^k]$ of the clauses into $2^k$ groups indexed by $\{0, \dots, 2^k - 1\}$, such 
that for any $i \in [2^k]$ no two clauses that are mapped to i contain variables of the same color, 

where $k := \lceil \log n - \log\log n \rceil + 9$. 

Proof: Let l be the coloring guaranteed by Lemma 4.1. We slightly overload the notation and 
by l(C) denote the set of colors of variables in $C \in \mathrm{Clauses}(\varphi)$. 

We construct the packing h in a greedy manner. Consider all the clauses of $\mathrm{Clauses}(\varphi)$ one by 
one in an arbitrary order. When a clause $C \in \mathrm{Clauses}(\varphi)$ is processed, we find any group $i \in [2^k]$, 
such that the set of colors of variables appearing in clauses already assigned to i is disjoint from 
l(C). If several such groups i exist, we pick an arbitrary one and assign h(C) := i. 

It remains to prove that such an i always exists for the value of k as stated in the lemma. We 
prove this by contradiction: suppose that at some point, for some clause C, for every i one of the 
colors in l(C) is already present in a clause already assigned to i. Let $N_C$ be the number of 
clauses of $\varphi$ containing at least one color from l(C). As there are exactly $2^k$ groups, and we cannot 
assign C to any of them, it means that 

$N_C \ge 2^k \ge 512\, n/\log n$, (4.1) 

since each of the $2^k$ groups is blocked by a different clause containing at least one color from l(C). 

On the other hand we have only 3 colors in l(C) and we know by Lemma 4.1 that no more 
than $\lceil n/(k-9) \rceil$ variables are assigned to any color, and by the upper bound on the frequency 
of each variable of $\varphi$ we know that no variable occurs in more than 4 clauses. Consequently, the 
number of clauses having at least one common color with C is upper bounded by 

$N_C \le 3 \cdot \lceil n/(k-9) \rceil \cdot 4 \le 12 \cdot (n/(k-9) + 1)$ 
$\le 12 \cdot (n/(\log n - \log\log n) + 1) \le 12 \cdot (2n/\log n + 1)$ 
$\le 12 \cdot (2n/\log n + 0.5\, n/\log n) = 30\, n/\log n$, (4.2) 

where in the last two inequalities we have used that $\log n - \log\log n \ge 0.5 \log n$ and $n/\log n \ge 2$ 
for n > 16. Note that (4.2) yields a contradiction with (4.1), and the lemma follows. ■ 
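
The greedy packing from the proof above can be illustrated with a short sketch. The function name `greedy_pack` and the data layout (clauses as tuples of variable names, the coloring as a dictionary) are assumptions made for this example only; the coloring itself is taken as given, as in Lemma 4.1.

```python
def greedy_pack(clauses, color, num_groups):
    """Assign each clause to the first group whose used colors are disjoint from it.

    clauses: list of tuples of variable names (hypothetical representation)
    color:   dict mapping each variable to its color
    Returns a list h with h[idx] = group index of clauses[idx].
    """
    used = [set() for _ in range(num_groups)]   # colors already present in each group
    h = []
    for clause in clauses:
        cols = {color[x] for x in clause}
        for i in range(num_groups):
            if used[i].isdisjoint(cols):
                used[i] |= cols
                h.append(i)
                break
        else:
            # The lemma's counting argument shows this cannot happen for 2^k groups.
            raise ValueError("no free group; more groups are needed")
    return h

color = {"a": 0, "b": 1, "c": 2, "d": 1, "e": 3, "f": 4, "g": 5, "h": 6}
clauses = [("a", "b", "c"), ("a", "d", "e"), ("d", "f", "g"), ("e", "f", "h")]
packing = greedy_pack(clauses, color, num_groups=4)
```

In this toy input the first three clauses pairwise share a color and therefore land in three different groups, while the fourth clause is color-disjoint from the first and joins its group.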

5 From 3-SAT to Subgraph Isomorphism with colors 

The technical crux of our result is a method of encoding information in permutations, that is, mappings 
from the pattern graph to the host graph. The intuition behind this technique is that the number 
of permutations of an n element set is $n! = 2^{\Theta(n \log n)}$ and therefore a single permutation carries 
$\Theta(n \log n)$ bits of information. This means that from the information-theoretic perspective it 
should be possible to encode an assignment of Boolean values to n variables using a permutation 
of $O(n/\log n)$ elements. 
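
The information-theoretic intuition can be checked numerically. The snippet below is a small illustration of our own (the helper name `bits_in_permutation` is hypothetical); it compares $\log_2 n!$ with $n \log_2 n$ for a few values of n.

```python
import math

def bits_in_permutation(n):
    """log2(n!), i.e. the number of bits needed to index one permutation of n elements."""
    # lgamma(n + 1) = ln(n!), converted here from natural log to base 2
    return math.lgamma(n + 1) / math.log(2)

sizes = [2**10, 2**16, 2**20]
ratios = [bits_in_permutation(n) / (n * math.log2(n)) for n in sizes]
for n, r in zip(sizes, ratios):
    print(n, round(r, 3))
```

The ratio stays below 1 and approaches it slowly as n grows, in line with Stirling's approximation $\log_2 n! = n \log_2 n - \Theta(n)$.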

Every element in a permutation is responsible for encoding some number of bits, forming what 
we call a pack of bits. We do not restrict ourselves to packs of constant size, but each pack we 
create is of size no greater than logarithmic. The position of an element in a permutation should 
uniquely determine the values of all the bits from its pack. The problem is, however, that in a 
permutation no two elements can be mapped to the same position, which potentially might make 
it impossible to assign the same valuation to two different packs of bits. 

Here, we present a new and simple way of circumventing this obstacle by guessing the sizes of 
preimages in a mapping corresponding to a satisfying assignment. Less formally, what we do is 
replicate some positions and remove other ones, so that in some branch our guess will transform a 
mapping we had in mind into a permutation. 

We would like to note that encoding groups of bits by a position in a permutation was already 
used by Marx, Lokshtanov and Saurabh [15] in the $k \times k$-Permutation Clique problem, as well 
as by Socala [17] in the lower bound for the Channel Assignment problem. Both of these two 
reductions (especially Lemma 2.3 from [17]) could be simplified when using our guessing preimage 
sizes approach, instead of a technical one-to-one reduction. 

In the remainder of this section we prove Lemma 3.2, that is, we show how to transform a 3-SAT 
formula $\varphi$ into $2^{O(n/\log n)}$ instances of $(O(1), O(\log n))$-Subgraph Isomorphism with $O(n/\log n)$ 
vertices. In order to do this we need to introduce notation for binary strings. Assume for a 
moment that n is a power of two, i.e., $n = 2^k$ for $k \in \mathbb{N}$. One can view elements in a permutation 
as integers between 0 and n - 1, denoted as [n], but also as a set of binary strings of length k, being 
the binary representations of numbers from [n], denoted as $2^{[k]}$. We will use the two conventions 
interchangeably and for this reason we need the following notation regarding binary strings. Let 
$B := \{0, 1\}^*$ be the set of all binary strings, and $B_k := \{0, 1\}^k$ be the set of binary strings of length 
exactly k. For a binary string s, let |s| be its length. We denote the i-th digit (starting from 0) of 
a binary string s as $s_i$. 

Proof of Lemma 3.2: Assume we are given a formula $\varphi$ with n variables, such that each 
clause involves exactly three variables and each variable appears in at most 4 clauses. Define 
$k := \lceil \log n - \log\log n \rceil + 9$. We prove that solving $\varphi$ can be reduced to solving fewer than 
$2^{2^{k+1}} = 2^{O(n/\log n)}$ instances of (3, k)-Subgraph Isomorphism, with vertex colors denoted as red, green 
and blue, and edge colors denoted by [k], where the number of vertices of both the pattern and 
host graph equals 

$2^k + 8 \cdot \binom{k}{3} + 1 = O(n/\log n)$. 

Satisfying assignment gadget. 

The assignment gadget G consists of a path on $8 \cdot \binom{k}{3}$ red vertices, with a single green vertex 
appended at one end. The red vertices will be uniquely identifiable based on the distance from 
the green vertex. Each red vertex will correspond to a choice of 3 distinct indices from [k] and an 
assignment of binary values to each of them: 

$(i_1, i_2, i_3, b_1, b_2, b_3) \in [k]^3 \times B_3$, 
$i_1 < i_2 < i_3$. 

Intuitively, an edge between one of the clause vertices and a red vertex will indicate that 'in this 
pack of clause valuations, the variables at positions $i_1, i_2, i_3$ are not assigned values $b_1, b_2, b_3$ at the 
same time'. All edges in G are of color 0. 

Pattern graph construction. 

The pattern graph will be constant across all the created instances. The pattern graph P consists 
of $2^k$ blue vertices corresponding to packs of clauses and a copy of the satisfying assignment gadget 
G. First, find the coloring l and packing h guaranteed by Lemma 4.2. We associate each blue 
vertex of P with a different group in the image of h. 

Figure 1: A simplified view of the pattern graph. 

For every variable x in $\varphi$ and every two distinct clauses $C_1, C_2$ containing x, we add an edge 
of color l(x) between the blue vertices corresponding to $h(C_1)$ and $h(C_2)$. Intuitively, these edges 
signify that x has to have a consistent valuation when choosing valuations of variables in packs 
containing $C_1$ and $C_2$. 

Additionally, for every clause C in $\varphi$ we add an edge of color 0 between h(C) (i.e., the pack 
containing C) and the red vertex $(i_1, i_2, i_3, b_1, b_2, b_3)$ where $i_1 < i_2 < i_3$ are the colors of variables 
in C and $(b_1, b_2, b_3)$ is their only valuation that does not satisfy C. 


Host graph construction. 

We will generate a different host graph for every sequence of preimage sizes of the valuations of 
the groups. Fix a sequence $s_0, s_1, \dots, s_{2^k-1}$, such that $s_i \ge 0$ for all i and $\sum_i s_i = 2^k$. 

The number of possible such sequences s is 

$\binom{2^{k+1} - 1}{2^k - 1} < 2^{2^{k+1}}$. 
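
The count of preimage-size sequences is a standard stars-and-bars quantity and can be verified directly for small k. The helper name `num_preimage_sequences` below is our own and serves only as an illustration.

```python
from math import comb

def num_preimage_sequences(k):
    """Number of sequences of 2^k non-negative integers that sum to 2^k."""
    m = 2 ** k
    # stars and bars: distribute m units among m positions
    return comb(2 * m - 1, m - 1)

counts = {k: num_preimage_sequences(k) for k in range(1, 6)}
# Each count stays below the bound 2^(2^(k+1)) used in the text.
bound_holds = all(counts[k] < 2 ** (2 ** (k + 1)) for k in counts)
```

For k = 1 the sequences are (0, 2), (1, 1) and (2, 0), which matches the formula.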


The host graph $H_s$ consists of $2^k$ blue vertices corresponding to valuations of the groups of 
clauses and a copy of the satisfying assignment gadget G. For the binary string of length k 
corresponding to $i \in [2^k]$, we generate $s_i$ vertices corresponding to it. 

For $j \in [k]$, we join two blue vertices u, v in $H_s$ with an edge of color j iff $u_j = v_j$, that is iff 
the j-th bit in both strings is the same. Intuitively, lack of an edge of color j between two blue 
vertices u, v in $H_s$ disallows assigning two packs of clauses to vertices u and v when the variable of 
color j in both packs is the same, as it would lead to inconsistent valuation. 

For every blue vertex u in $H_s$ and red vertex $v = (i_1, i_2, i_3, b_1, b_2, b_3)$ in G, we connect u and v 
with an edge of color 0 iff $u_{i_j} \ne b_j$ for some $j \in \{1, 2, 3\}$. Less formally, lack of an edge between a blue 
vertex u and a red vertex $v = (i_1, i_2, i_3, b_1, b_2, b_3)$ means that a pack of clauses can be assigned to 
u only if the pack contains no clause that becomes unsatisfied when the values $b_1, b_2, b_3$ are assigned 
to the variables of colors $i_1, i_2, i_3$, respectively. 
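
The two adjacency rules of the host graph can be written as small predicates. The sketch below uses our own hypothetical representation (bit strings as tuples of 0/1, a red vertex as a 6-tuple of three indices followed by three bits) and is not taken from the paper.

```python
def blue_blue_colors(u, v, k):
    """Colors j in [k] of the edges joining blue host vertices with bit strings u and v."""
    return [j for j in range(k) if u[j] == v[j]]

def blue_red_edge(u, red):
    """True iff blue vertex u is joined to the red vertex (i1, i2, i3, b1, b2, b3).

    The edge exists iff u disagrees with the forbidden valuation on some index.
    """
    indices, bits = red[:3], red[3:]
    return any(u[i] != b for i, b in zip(indices, bits))

u = (0, 1, 1, 0)
v = (0, 0, 1, 1)
shared = blue_blue_colors(u, v, k=4)             # positions where u and v agree
blocked = blue_red_edge(u, (0, 1, 3, 0, 1, 0))   # u matches the forbidden valuation
allowed = blue_red_edge(u, (0, 1, 3, 0, 1, 1))   # u differs at index 3
```

A pack containing a clause falsified by the valuation (0, 1, 0) on colors 0, 1, 3 needs the edge to that red vertex, so it cannot be placed on u in this example.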

Proof of correctness 3-SAT. 

As the construction can be carried out in polynomial time per instance and both the host and 
pattern graphs have $O(n/\log n)$ vertices as promised, it remains to prove that $\varphi$ is satisfiable iff 
for some instance the pattern graph P is a subgraph of the host graph $H_s$. 

Claim 5.1 If $\varphi$ is satisfiable, then for some sequence of preimage sizes s, P is a subgraph of $H_s$. 

Proof: First, assume that $\varphi$ is satisfiable and let $\mathrm{val} : \mathrm{Var}(\varphi) \to \{\mathrm{true}, \mathrm{false}\}$ be a satisfying 
assignment. We construct a mapping $g : V(P) \to V(H_s)$ as follows. For a group $i \in [2^k]$ let $\mathrm{Var}_i$ 
be the set of variables occurring in all the clauses assigned to i by the packing h. If any colors do 
not occur in $l(\mathrm{Var}_i)$, add arbitrary variables to $\mathrm{Var}_i$ so that $l(\mathrm{Var}_i) = [k]$. Define $f(i) = \mathrm{val}|_{\mathrm{Var}_i}$, 
i.e., the bit string representing valuation of variables from $\mathrm{Var}_i$ by val. Let s be the sequence of 
preimage sizes of f. A bijection g corresponding to f exists between the blue vertices of P and $H_s$. 
We extend g to all the vertices of P by mapping each vertex of the satisfying assignment gadget 
G in P to its corresponding copy in $H_s$, obtaining a bijection $g' : V(P) \to V(H_s)$. 

It remains to check that g' preserves all the edges. Clearly, the edges within the satisfying 
assignment gadget G are preserved. Consider any edge of color $c \in [k]$ in the pattern graph 
between two blue vertices u, v, corresponding to groups i and j. By construction, this means that 
the packs $h^{-1}(i)$ and $h^{-1}(j)$ share a variable of color c, which means that by the definition of f 
the bit strings f(i) and f(j) assign the same value to the index corresponding to this variable. As 
g extends f, we have g(i) = f(i) and g(j) = f(j), hence the bit strings corresponding to g'(u) 
and g'(v) have the same value on the c-th position, hence by construction of the host graph g'(u) 
and g'(v) are connected by an edge of color c. Finally, we inspect the edges between blue vertices 
and red vertices. Consider a blue vertex u associated with a set $A \subseteq [k]$, which is connected to 
some red vertex $v = (i_1, i_2, i_3, b_1, b_2, b_3)$, because of a clause $C \in h^{-1}(A)$. As val is a satisfying 
assignment and g extends bit strings assigned by f, we infer that the vertex g(u) is connected to 
the red vertex v. Consequently, P is a subgraph of $H_s$ witnessed by the mapping g'. 

In Appendix B we prove the following claim. 

Claim 5.2 If for some s it holds that P is a subgraph of $H_s$, then $\varphi$ is satisfiable. 

Claims 5.1 and 5.2 prove equivalence of the formula $\varphi$ and created instances of (3, k)-Subgraph 
Isomorphism, hence the proof of Lemma 3.2 follows. ■ 


We would like to note that Lemma 3.2 implies a tight bound for the auxiliary version of 
Subgraph Isomorphism with colors, even in the case when the number of vertex colors is constant 
and the number of edge colors is logarithmic. 

Corollary 5.3 There is no $2^{o(n \log n)}$ time algorithm for the $(O(1), O(\log n))$-Subgraph 
Isomorphism problem, unless the Exponential Time Hypothesis fails. 

6 Removing the colors 

In this section we prove Lemma 3.3, first by showing how to remove colors from edges, and next 
by removing colors from vertices. Due to space constraints, we only sketch the constructions, and 
the formal proof of the equivalence of created instances is deferred to Appendix C. 

Lemma 6.1 An instance (G, H) of (c, t)-Subgraph Isomorphism such that $|V(G)| = |V(H)|$ 
can be reduced to an instance (G', H') of (c + 1, 1)-Subgraph Isomorphism with $O(\sqrt{t})$ times more 
vertices such that $|V(G')| = |V(H')|$. 


Sketch of proof: Let (G, H) be an instance of (c, t)-Subgraph Isomorphism such that $|V(G)| = 
|V(H)|$. Assume that none of the vertices of the instance (G, H) was colored yellow. Let $t' := 
2\lceil\sqrt{t}\rceil$. Note that for $t \ge 1$ we have $\lceil\sqrt{t}\rceil \ge 1$ and then 

$\binom{t'}{2} = \frac{2\lceil\sqrt{t}\rceil \cdot (2\lceil\sqrt{t}\rceil - 1)}{2} \ge \lceil\sqrt{t}\rceil^2 \ge t$. 

Figure 2: The reduction described in Lemma 6.1. In the example, t = 3, $t' = 2\lceil\sqrt{t}\rceil = 4$, 
p(red) = (1, 2), p(green) = (1, 3) and p(blue) = (1, 4). 

Figure 3: The reduction described in Lemma 6.2. In the example, the green and blue colors of 
the vertices represent the numbers 2 and 3 respectively and the blue color of the edges represents 
the number 1. 

Therefore for each color $x \in [t]$ we can pick a different pair p(x) := (i, j) where $1 \le i < j \le t'$. 
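
The choice of pairs can be made concrete as follows. This is a sketch under our own assumptions (the function name `color_pairs` is hypothetical, and pairs are handed out in lexicographic order, which is one of many valid choices).

```python
from itertools import combinations
from math import isqrt

def color_pairs(t):
    """Give each edge color x in range(t) a distinct pair (i, j) with 1 <= i < j <= t'."""
    root = isqrt(t)
    if root * root < t:
        root += 1                      # root = ceil(sqrt(t))
    t_prime = 2 * root
    pairs = list(combinations(range(1, t_prime + 1), 2))
    assert len(pairs) >= t             # binom(t', 2) >= t, as argued in the text
    return t_prime, {x: pairs[x] for x in range(t)}

t_prime, p = color_pairs(3)
```

For t = 3 this yields t' = 4 and the pairs (1, 2), (1, 3), (1, 4), the same assignment as in the example of Figure 2.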

For every vertex u in either the pattern or the host graph, we replace it by a gadget consisting 
of t' + 2 vertices (see Fig. 2): 

• a center vertex $u_0$ of the same color as u, and 

• a path on t' + 1 yellow vertices $u_1, \dots, u_{t'+1}$, the first t' of which are connected to the center 
vertex. 

For every edge (u, v) of color x in either the pattern or the host graph, we replace it by the 
edges $(u_i, v_j)$ and $(u_j, v_i)$ in the modified graph, where (i, j) = p(x). We denote this new 
instance of (c + 1, 1)-Subgraph Isomorphism as (G', H'). Note that $|V(G')| = (t' + 2) \cdot |V(G)|$ 
and $|V(H')| = (t' + 2) \cdot |V(H)|$, hence $|V(G')| = |V(H')|$ and also $|V(G')| = O(\sqrt{t}) \cdot |V(G)|$ and 
$|V(H')| = O(\sqrt{t}) \cdot |V(H)|$. In Appendix C we show that G is a subgraph of H iff G' is a subgraph 
of H'. ■ 

Having reduced the number of edge colors down to one, it remains to reduce the number of 
vertex colors. Note that in the following lemma it would be enough to assume t = 1, however we 
prove the lemma in a more general form as it does not affect the complexity of the proof. 

Lemma 6.2 An instance (G, H) of (c, t)-Subgraph Isomorphism such that $|V(G)| = |V(H)|$ 
can be reduced to an instance (G', H') of (1, t)-Subgraph Isomorphism with O(c) times more 
vertices such that $|V(G')| = |V(H')|$. 

Sketch of proof: Let (G, H) be an instance of (c, t)-Subgraph Isomorphism such that $|V(G)| = 
|V(H)|$. Number the vertex colors arbitrarily from 1 to c and number the edge colors arbitrarily 
from 1 to t. We can assume that for every vertex color the number of the vertices in this color 
in G and in H is the same because otherwise we can produce a trivial NO instance as (G', H'). 
In both pattern and host graphs, for each vertex v, attach i + 1 new leaves $v_1, v_2, \dots, v_{i+1}$ to it, 
where i is the color of v, using edges of color 1 (or any fixed color from 1 to t). We also denote 
$v_0 = v$. Consider the (1, t)-Subgraph Isomorphism instance (G', H') on the new graphs. For 
every vertex color the number of the vertices in that color in G is the same as in H and therefore 
the number of added leaves is the same in G' as in H'. Hence $|V(G')| = |V(H')|$. In Appendix C 
we show G is a subgraph of H iff G' is a subgraph of H'. ■ 
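
The leaf-attachment step can be sketched as below. The representation is an assumption made for illustration only: vertices are hashable labels, `color` maps each vertex to a number from 1 to c, edge colors are ignored (the case t = 1), and the name `remove_vertex_colors` is hypothetical.

```python
def remove_vertex_colors(vertices, edges, color):
    """Attach color[v] + 1 fresh leaves to every vertex v and drop the vertex colors."""
    new_vertices = list(vertices)
    new_edges = list(edges)
    for v in vertices:
        for j in range(1, color[v] + 2):   # leaves v_1, ..., v_{i+1} for color i
            leaf = (v, j)
            new_vertices.append(leaf)
            new_edges.append((v, leaf))
    return new_vertices, new_edges

vertices = ["a", "b"]
edges = [("a", "b")]
color = {"a": 1, "b": 2}
new_vertices, new_edges = remove_vertex_colors(vertices, edges, color)
```

Here "a" receives 2 leaves and "b" receives 3, so the number of pendant leaves at a vertex records its former color.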

Proof of Lemma 3.3: The thesis follows directly from consecutive application of Lemmas 6.1 
and 6.2. ■ 

7 From Graph Homomorphism to Subgraph Isomorphism 


Graph Homomorphism 
Input: undirected graphs G, H. 

Question: Is there a homomorphism from G to H, i.e., does there exist a function $h : V(G) \to 
V(H)$, such that for each edge $uv \in E(G)$ we have $h(u)h(v) \in E(H)$? 

In this section we present a reduction which shows that one can solve the Graph 
Homomorphism problem by solving $2^n$ instances of the Subgraph Isomorphism problem, 
demonstrating that the lower bound of Fomin et al. [9] implies a 
$2^{\Omega(n \log n/\log\log n)}$ lower bound under the Exponential Time Hypothesis for the Subgraph 
Isomorphism problem, where $n = |V(G)| + |V(H)|$. 

Proof of Theorem 1.3: Let (G, H) be an instance of Graph Homomorphism and denote
n = |V(G)| + |V(H)|. Note that any homomorphism h from G to H can be associated with a
sequence of non-negative numbers (|h^{-1}(v)|)_{v ∈ V(H)}, being the numbers of vertices of G mapped to
particular vertices of H. The sum of the numbers in such a sequence equals exactly |V(G)|. As
the number of such sequences is at most 2^n, we can enumerate all such sequences in time
2^n poly(n). For each such sequence (a_v)_{v ∈ V(H)} we create a new instance (G', H') of Subgraph
Isomorphism, where the pattern graph remains the same, i.e., G' = G, and in the host graph H'
each vertex v ∈ V(H) is replicated exactly a_v times (possibly zero). Observe that |V(H')| =
|V(G')|.

We claim that G admits a homomorphism to H iff for some sequence (a_v)_{v ∈ V(H)} the graph
G' is a subgraph of H'. First, assume that G admits a homomorphism h to H. Consider the
instance (G', H') created for the sequence a_v = |h^{-1}(v)| and observe that we can create a bijection
h' : V(G') → V(H') by assigning each v ∈ V(G') to its private copy of h(v). As h is a homomorphism,
so is h', and as h' is at the same time a bijection, we infer that G' is a subgraph of H'.

On the other hand, if for some sequence (a_v)_{v ∈ V(H)} the constructed graph G' is a subgraph of
H', then projecting the witnessing injection g : V(G') → V(H'), so that g'(v) is defined as the
prototype of the copy g(v), gives a homomorphism g' from G to H, as the copies of each v ∈ V(H) form
an independent set in H'. ■
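
The reduction in this proof can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: H is assumed to be simple (no loops), graphs are given as vertex and edge lists, the Subgraph Isomorphism instances are solved by brute force over bijections, and all function names are hypothetical.

```python
from itertools import permutations


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def replicate_host(h_vertices, h_edges, counts):
    """Build H': a_v copies of each v; copies of adjacent vertices are joined,
    copies of the same vertex stay pairwise non-adjacent."""
    copies = {v: [(v, k) for k in range(a)] for v, a in zip(h_vertices, counts)}
    new_vertices = [c for v in h_vertices for c in copies[v]]
    new_edges = {frozenset((cu, cv))
                 for u, v in h_edges for cu in copies[u] for cv in copies[v]}
    return new_vertices, new_edges


def is_spanning_subgraph(g_vertices, g_edges, h_vertices, h_edges):
    """Brute force: is there an edge-preserving bijection V(G) -> V(H)?"""
    for images in permutations(h_vertices):
        g = dict(zip(g_vertices, images))
        if all(frozenset((g[u], g[v])) in h_edges for u, v in g_edges):
            return True
    return False


def has_homomorphism(g_vertices, g_edges, h_vertices, h_edges):
    """Guess the preimage sizes (a_v) and test each resulting instance."""
    for counts in compositions(len(g_vertices), len(h_vertices)):
        hv, he = replicate_host(h_vertices, h_edges, counts)
        if is_spanning_subgraph(g_vertices, g_edges, hv, he):
            return True
    return False


path_result = has_homomorphism(["a", "b", "c"], [("a", "b"), ("b", "c")], [0, 1], [(0, 1)])
triangle_result = has_homomorphism(
    ["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")], [0, 1], [(0, 1)]
)
```

For the three-vertex path and a single-edge host, the guess with preimage sizes 1 and 2 produces a star with two leaves as H', which contains the path as a spanning subgraph.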
A Missing proofs from Section 4

Before we prove Lemma 4.1, we show the existence of a potentially unbalanced 9-coloring. 

Lemma A.1 Given a 3-SAT formula φ with n variables, where each variable occurs in at most
4 clauses, we can color the variables of φ in polynomial time using at most 9 colors, so that no
clause contains two variables of the same color.

Proof: Construct an auxiliary graph G_φ, the vertex set of which is the set of variables of φ,
where two vertices of G_φ are adjacent iff they both appear in at least one common clause of φ. Note
that the maximum degree of G_φ is bounded by 8, as each variable appears in at most 4 clauses
and each clause contains at most 3 literals. Consequently, we can color G_φ with at most 9 colors
in a greedy manner. ■
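
The greedy argument can be sketched in Python as below. The representation of clauses as tuples of variable names (ignoring literal signs) and the function name `greedy_variable_coloring` are assumptions made for this illustration.

```python
def greedy_variable_coloring(clauses):
    """Color variables so that no clause contains two variables of one color.

    clauses: list of tuples of variable names (signs of literals are irrelevant).
    Each variable gets the smallest color not used by an already colored
    neighbor, so a variable with d neighbors receives a color in 0..d.
    """
    neighbors = {}
    for clause in clauses:
        for x in clause:
            neighbors.setdefault(x, set()).update(y for y in clause if y != x)
    color = {}
    for x in neighbors:
        used = {color[y] for y in neighbors[x] if y in color}
        color[x] = next(c for c in range(len(neighbors[x]) + 1) if c not in used)
    return color


# Hypothetical toy formula; every variable occurs in at most 4 clauses.
toy_clauses = [("x1", "x2", "x3"), ("x1", "x4", "x5"), ("x2", "x4", "x6")]
toy_coloring = greedy_variable_coloring(toy_clauses)
```

When every variable has at most 8 neighbors, as in the lemma, the colors used lie in 0..8, which gives the bound of 9 colors.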

Proof of Lemma 4.1: First, color the variables with 9 colors using Lemma A.1. Then, while
there exists a color with more than ⌈n/(k − 9)⌉ variables assigned to it, separate ⌈n/(k − 9)⌉ of
them to form a new color. This can occur at most k − 9 times, and the lemma follows. ■
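
The splitting step can be sketched in Python as follows, assuming the input is a 9-coloring with colors 0..8 stored in a dictionary and that k > 9; the function name `balance_coloring` is hypothetical. Splitting a color class never puts two variables of one clause into the same class, since the new class is a subset of an old one.

```python
from math import ceil


def balance_coloring(color, k):
    """Split oversized color classes into new colors (illustrative sketch).

    color: dict variable -> color in 0..8
    k:     target number of colors, assumed to satisfy k > 9
    Returns a coloring in which every class has at most ceil(n / (k - 9)) variables.
    """
    n = len(color)
    limit = ceil(n / (k - 9))
    classes = {}
    for x, c in color.items():
        classes.setdefault(c, []).append(x)
    result = dict(color)
    next_color = 9
    for members in classes.values():
        while len(members) > limit:
            # Move `limit` variables of this class into a brand-new color.
            moved, members = members[:limit], members[limit:]
            for x in moved:
                result[x] = next_color
            next_color += 1
    return result


# Hypothetical input: 20 variables in two classes of size 10, with k = 13.
start = {f"x{i}": i % 2 for i in range(20)}
balanced = balance_coloring(start, 13)
```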

B Missing proofs from Section 5 

Proof of Claim 5.2: Let g be a mapping from P to Hg witnessing the fact that P is a
subgraph of Hg. As g respects colors, we infer that the single green vertex in P is mapped to the
single green vertex in Hg. Similarly, all the red vertices of P have to be mapped to red vertices
of Hg. Additionally, the distance between each red vertex v and the green vertex in P cannot
be smaller than the distance between g(v) and the green vertex in Hg. As red vertices induce a
path, and the green vertex is pendant to one of its ends, we infer that g assigns each vertex of
the satisfying-assignment-gadget in P to its copy in Hg (in short, by construction there are no
non-trivial automorphisms of the gadget).

Construct an assignment val : Var(φ) → {true, false} as follows. For a variable x ∈ Var(φ)
find any clause C that contains x and set val(x) to true iff g(h(C))_{l(x)} = 1, where h(C) is the
blue vertex associated with C and l(x) is the color of the variable x. Note that by construction
the assignment val is well-defined, as edges between blue vertices guarantee consistency. Consider
a clause C. The edges between h(C) and red vertices in the pattern graph P have to be preserved
by g, and we already observed that g maps red vertices of P to their corresponding copies in Hg.
Hence, we infer that there is an edge between g(h(C)) and the red vertex v = (i_1, i_2, i_3, b_1, b_2, b_3),
where b_1, b_2, b_3 is the only assignment to the variables of C, where l(C) = {i_1, i_2, i_3}, which does not
satisfy C. This in turn implies that for at least one variable of C the assignment val assigns a
different value than the one corresponding to the appropriate bit from {b_1, b_2, b_3}. Consequently,
val is a satisfying assignment. ■

C Missing proofs from Section 6 

Here, we present the missing parts of the proofs of Lemmas 6.1 and 6.2 from Section 6.

Proof of Lemma 6.1: It remains to prove that G is a subgraph of H iff G' is a subgraph
of H'. If G is a subgraph of H then there exists an injective function f : V(G) → V(H) such
that edges and colors are preserved. Note that every vertex of the instance (G', H') is of the form
u'_i for some vertex u of the instance (G, H). Let g : V(G') → V(H') be a function such that
g(u'_i) = f(u)'_i. The function g is an injection because the function f is an injection. The function g
preserves the colors of the vertices because col(g(u'_0)) = col(f(u)'_0) = col(f(u)) = col(u) = col(u'_0)
and for i > 0 the color of u'_i is always yellow. The function g preserves also the edges. Let u'_i v'_j
be an edge in G' such that i ≤ j (we can assume this w.l.o.g.). If u = v then there exists also an
edge f(u)'_i f(v)'_j = g(u'_i)g(v'_j) in H' because all the gadgets have exactly the same structure of the
internal edges, i.e., if there exists an edge u'_i u'_j in a gadget for any vertex u then for every vertex v
there exists an edge v'_i v'_j in a gadget for vertex v. If u ≠ v then 1 ≤ i ≤ j ≤ t' and there exists an
edge uv of the color x = p^{-1}((i, j)) in G and then there exists an edge f(u)f(v) of the color x in H
and (because (i, j) = p(x)) we know that there exists an edge f(u)'_i f(v)'_j = g(u'_i)g(v'_j) in H'. The
edges of G' and H' have only one color, thus g preserves the colors of the edges trivially. Hence G' is
a subgraph of H'.

If G' is a subgraph of H' then there exists an injective function g : V(G') → V(H') such that
edges and colors are preserved. The vertices of the form u'_0 are the only vertices of G' and H' which
are not yellow. Therefore if g(u'_i) = v'_j, then i = 0 iff j = 0. If g(u'_0) = v'_0 then for every i such that
1 ≤ i ≤ t' we have g(u'_i) = v'_j for some 1 ≤ j ≤ t', because the vertices u'_1, u'_2, ..., u'_{t'} are yellow
neighbors of u'_0 and the vertices v'_1, v'_2, ..., v'_{t'} are the only yellow neighbors of v'_0. On the other hand we
know that |V(G)| = |V(H)| and then the number of the vertices of the form u'_i for 1 ≤ i ≤ t' is the
same in G' and in H'. Therefore for every vertex v'_i in H' such that 1 ≤ i ≤ t' there exists a vertex u'_j
in G' such that 1 ≤ j ≤ t' and g(u'_j) = v'_i. Therefore if g(u'_i) = v'_j then i > t' iff j > t'. Moreover,
the vertices u'_1, u'_2, ..., u'_{t'} create a path (in this order) and the only directed paths containing
exactly the vertices {v'_1, v'_2, ..., v'_{t'}} = g({u'_1, u'_2, ..., u'_{t'}}) are v'_1, v'_2, ..., v'_{t'} and v'_{t'}, ..., v'_2, v'_1.
But the vertex u'_{t'} is a neighbor of a vertex u'_i with i > t', and the vertex v'_1 has no neighbor of the form
w'_i with i > t' for any vertex w in H. On the other hand this neighbor of u'_{t'} has to be mapped to a vertex of
the form w'_i with i > t' for some vertex w in H. Therefore the path u'_1, u'_2, ..., u'_{t'} is mapped to the path
v'_1, v'_2, ..., v'_{t'}, i.e., for every vertex u'_i such that 1 ≤ i ≤ t' we have g(u'_i) = v'_i. Let f : V(G) → V(H)
be a function such that f(u) = v iff g(u'_0) = v'_0 (note that then f(u)'_0 = v'_0 = g(u'_0)). The
function f is an injection because the function g is an injection. The function f preserves the
colors of the vertices because col(f(u)) = col(f(u)'_0) = col(g(u'_0)) = col(u'_0) = col(u). The function
f preserves also the edges with their colors because if there is an edge uv of the color x in the
graph G then for (i, j) = p(x) there is an edge u'_i v'_j in the graph G' and therefore there is an edge
g(u'_i)g(v'_j) = f(u)'_i f(v)'_j in the graph H' and then there is an edge f(u)f(v) of the color x in the
graph H. Hence G is a subgraph of H. ■

Proof of Lemma 6.2: If G is a subgraph of H then there exists an injective function f :
V(G) → V(H) such that edges and colors are preserved. Note that every vertex of the instance
(G', H') is of the form v_i for some vertex v of the instance (G, H). Let g : V(G') → V(H') be a
function such that g(v_i) = f(v)_i, which is a correctly defined function because col(v) = col(f(v))
and therefore v_0 has the same number of leaves in the graph G' as f(v)_0 in the graph H'. The
function g is an injection because the function f is an injection. The function g preserves the colors
of the vertices trivially. We show that the function g preserves also edges and their colors. Let us
assume that there is an edge u_i v_j for i ≤ j (we can assume that w.l.o.g.) of the color x in the graph
G'. If j > 0 then u = v, i = 0 and x = 1, and there exists also an edge f(u)_0 f(u)_j = g(u_i)g(v_j)
of the color 1 = x in the graph H'. Otherwise we have i = j = 0 and then there exists an edge
uv of the color x in the graph G, thus there exists an edge f(u)f(v) of the color x in the graph H,
hence there exists an edge f(u)_0 f(v)_0 = g(u_i)g(v_j) of the color x in the graph H'. Therefore G' is
a subgraph of H'.

If G' is a subgraph of H' then there exists an injective function g : V(G') → V(H') such that
edges and colors are preserved. All vertices from the original pattern graph have to be matched
to vertices from the original host graph, as they are the only ones of degree greater than 1 in the
new graphs. But the number of the vertices of the form u_0 is the same in G' as in H' because
|V(G)| = |V(H)|. Therefore for every vertex of the form v_0 in H' there exists a vertex of the form
u_0 in G' such that g(u_0) = v_0. Hence, the leaves have to map to leaves. But the number of leaves is
the same in G' as in H'. Thus for every leaf v_i in H' there exists a leaf u_j in G' such that g(u_j) = v_i.
Hence all the leaves are used and then for every vertex u_0 in G' the number of leaves of u_0 in G' is
the same as the number of leaves of g(u_0) in H'. Let us consider a function f : V(G) → V(H) such
that f(u) = v iff g(u_0) = v_0 (then f(u)_0 = v_0 = g(u_0)). Note that f = g|_{V(G)}. The function f is an
injection because the function g is an injection. The function f preserves the colors of the vertices
because for every v in the graph G we have that v_0 has exactly col(v) + 1 leaves as neighbors in the
graph G' and then g(v_0) = f(v)_0 has also exactly col(v) + 1 leaves as neighbors. But on the other
hand the vertex f(v)_0 has exactly col(f(v)) + 1 leaves as neighbors in the graph H' and therefore
col(f(v)) = col(v). The function f preserves also the edges with their colors because for every edge
uv of a color x in the graph G there exists an edge u_0 v_0 of the color x in the graph G' and therefore
there exists an edge g(u_0)g(v_0) = f(u)_0 f(v)_0 of the color x in the graph H' and hence there exists
an edge f(u)f(v) of the color x in the graph H. Therefore G is a subgraph of H. ■